{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problems"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 12-1 Binary search trees with equal keys\n",
    "\n",
    "> Equal keys pose a problem for the implementation of binary search trees."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*a*__. What is the asymptotic performance of TREE-INSERT when used to insert $n$ items with identical keys into an initially empty binary search tree?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\Theta(n)$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*b*__. Keep a boolean flag $x.b$ at node $x$, and set $x$ to either $x.left$ or $x.right$ based on the value of $x.b$, which alternates between FALSE and TRUE each time we visit $x$ while inserting a node with the same key as $x$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\Theta(n \\lg n)$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*c*__. Keep a list of nodes with equal keys at $x$, and insert $z$ into the list."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If linked list is used, it could be $\\Theta(1)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*d*__. Randomly set $x$ to either $x.left$ or $x.right$. (Give the worst-case performance and informally derive the expected running time.)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Worst: $\\Theta(n)$.\n",
    "\n",
    "Expected: $\\Theta(n \\lg n)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 12-2 Radix trees\n",
    "\n",
    "> Given two strings $a = a_0a_1 \\dots a_p$ and $b = b_0b_1 \\dots b_q$, where each $a_i$ and each $b_j$ is in some ordered set of characters, we say that string $a$ is lexicographically less\n",
    "than string $b$ if either\n",
    "1. there exists an integer $j$, where $0 \\le j \\le \\min(p, q)$, such that $a_i = b_i$ for all $i = 0, 1, \\dots j - 1$ and $a_j < b_j$, or\n",
    "2. $p < q$ and $a_i = b_i$ for all $i = 0, 1, \\dots, p$.\n",
    "\n",
    "> The radix tree data structure shown in Figure 12.5 stores the bit strings 1011, 10, 011, 100, and 0. When searching for a key $a = a_0a_1 \\dots a_p$, we go left at a node of depth $i$ if $a_i = 0$ and right if $a_i = 1$. Let $S$ be a set of distinct bit strings whose lengths sum to $n$. Show how to use a radix tree to sort $S$ lexicographically in $\\Theta(n)$ time. For the example in Figure 12.5, the output of the sort should be the sequence 0, 011, 10, 100, 1011."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Insert all the bit strings into radix tree costs $\\Theta(n)$, then use preorder tree walk to sort the strings costs $\\Theta(n)$, the total cost is still $\\Theta(n)$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "class TreeNode:\n",
    "    def __init__(self, val, left=None, right=None):\n",
    "        self.val = val\n",
    "        self.left = left\n",
    "        self.right = right\n",
    "\n",
    "\n",
    "class RadixTree:\n",
    "    def __init__(self):\n",
    "        self.root = None\n",
    "\n",
    "    def insert(self, a):\n",
    "        self.root = self.insert_rec(self.root, a, 0)\n",
    "\n",
    "    def insert_rec(self, root, a, idx):\n",
    "        if idx == len(a):\n",
    "            if root is None:\n",
    "                return TreeNode(a)\n",
    "            root.val = a\n",
    "            return root\n",
    "        if root is None:\n",
    "            root = TreeNode(None)\n",
    "        if a[idx] == '0':\n",
    "            root.left = self.insert_rec(root.left, a, idx+1)\n",
    "        else:\n",
    "            root.right = self.insert_rec(root.right, a, idx+1)\n",
    "        return root\n",
    "\n",
    "    def sorted(self):\n",
    "        self.sorted_list = []\n",
    "        self.sorted_rec(self.root)\n",
    "        return self.sorted_list\n",
    "\n",
    "    def sorted_rec(self, root):\n",
    "        if root is None:\n",
    "            return\n",
    "        if root.val is not None:\n",
    "            self.sorted_list.append(root.val)\n",
    "        self.sorted_rec(root.left)\n",
    "        self.sorted_rec(root.right)\n",
    "\n",
    "\n",
    "def sort_strings(strs):\n",
    "    radix_tree = RadixTree()\n",
    "    for s in strs:\n",
    "        radix_tree.insert(s)\n",
    "    return radix_tree.sorted()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 12-3 Average node depth in a randomly built binary search tree\n",
    "\n",
    "> In this problem, we prove that the average depth of a node in a randomly built binary search tree with n nodes is $O(\\lg n)$. Although this result is weaker than that of Theorem 12.4, the technique we shall use reveals a surprising similarity between the building of a binary search tree and the execution of RANDOMIZED-QUICKSORT from Section 7.3.\n",
    "\n",
    "> We define the __*total path length*__ $P(T)$ of a binary tree $T$ as the sum, over all nodes $x$ in $T$ , of the depth of node $x$, which we denote by $d(x, T)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*a*__. Argue that the average depth of a node in $T$ is\n",
    "\n",
    "> $\\displaystyle \\frac{1}{n} \\sum_{x \\in T} d(x, T) = \\frac{1}{n} P(T)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Obviously."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Thus, we wish to show that the expected value of $P(T)$ is $O(n \\lg n)$.\n",
    "\n",
    "> __*b*__. Let $T_L$ and $T_R$ denote the left and right subtrees of tree $T$, respectively. Argue that if $T$ has $n$ nodes, then\n",
    "\n",
    "> $P(T) = P(T_L) + P(T_R) + n - 1$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There are $n-1$ nodes in $P(T_L)$ and $P(T_R)$, each increase by 1."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*c*__. Let $P(n)$ denote the average total path length of a randomly built binary search tree with n nodes. Show that\n",
    "\n",
    "> $\\displaystyle P(n) = \\frac{1}{n} \\sum_{i=0}^{n-1} (P(i) + P(n-i-1) + n - 1)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The root is equally likely to be the rank in $[1, n]$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*d*__. Show how to rewrite $P(n)$ as\n",
    "\n",
    "> $\\displaystyle P(n) = \\frac{2}{n} \\sum_{k=1}^{n-1} P(k) + \\Theta(n)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each item $P(1), P(2), \\dots, P(n)$ appears twice in the summation, and\n",
    "\n",
    "$$\n",
    "\\frac{1}{n} \\sum_{i=0}^{n-1} n - 1 = \\frac{(n-3)n}{2n} = \\Theta(n)\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*e*__. Recalling the alternative analysis of the randomized version of quicksort given in Problem 7-3, conclude that $P(n) = O(n \\lg n)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Based on Problem 7-3,  $P(n) = O(n \\lg n)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*f*__. Describe an implementation of quicksort in which the comparisons to sort a set of elements are exactly the same as the comparisons to insert the elements into a binary search tree."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Choose the pivot that it has the lowest index in the original list."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 12-4 Number of different binary trees\n",
    "\n",
    "> Let $b_n$ denote the number of different binary trees with $n$ nodes. In this problem, you will find a formula for $b_n$, as well as an asymptotic estimate."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*a*__. Show that $b_0 = 1$ and that, for $n \\ge 1$,\n",
    "\n",
    "> $\\displaystyle b_n = \\sum_{k=0}^{n-1} b_k b_{n-1-k}$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A root with two subtree."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*b*__. Referring to Problem 4-4 for the definition of a generating function, let $B(x)$ be the generating function\n",
    "\n",
    "> $\\displaystyle B(x) = \\sum_{n=0}^\\infty b_n x^n$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\begin{array}{rll}\n",
    "B(x)^2 &=& (b_0 x^0 + b_1 x^1 + b_2 x^2 + \\dots) ^ 2 \\\\\n",
    "&=& b_0^2 x^0 + (b_0 b_1 + b_1 b_0) x^1 + (b_0 b_2 + b_1 b_1 + b_2 b_0) x^2 + \\dots \\\\\n",
    "&=& \\displaystyle \\sum_{k=0}^0 b_k b_{0-k} x^0 + \\sum_{k=0}^1 b_k b_{1-k} x^1 + \\sum_{k=0}^2 b_k b_{2-k} x^2 + \\dots \\\\\n",
    "\\end{array}\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\begin{array}{rll}\n",
    "xB(x)^2 + 1 &=& \\displaystyle 1 + \\sum_{k=0}^0 b_k b_{1-1-k} x^1 + \\sum_{k=0}^2 b_k b_{2-1-k} x^3 + \\sum_{k=0}^2 b_k b_{3-1-k} x^2 + \\dots \\\\\n",
    "&=& 1 + b_1 x^1 + b_2 x^2 + b_3 x^3 + \\dots \\\\\n",
    "&=& b_0 x^0 + b_1 x^1 + b_2 x^2 + b_3 x^3 + \\dots \\\\\n",
    "&=& \\displaystyle \\sum_{n=0}^\\infty b_n x^n \\\\\n",
    "&=& B(x)\n",
    "\\end{array}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Show that $B(x) = x B(x)^2 + 1$, and hence one way to express $B(x)$ in closed form is\n",
    "\n",
    "> $\\displaystyle B(x) = \\frac{1}{2x} (1 - \\sqrt{1 - 4x})$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\begin{array}{rll}\n",
    "x B(x)^2 + 1 &=& \\displaystyle x \\cdot \\frac{1}{4x^2} (1 + 1 - 4x - 2\\sqrt{1 - 4x}) + 1 \\\\\n",
    "&=& \\displaystyle \\frac{1}{4x} (2 - 2\\sqrt{1 - 4x}) - 1 + 1 \\\\\n",
    "&=& \\displaystyle \\frac{1}{2x} (1 - \\sqrt{1 - 4x}) \\\\\n",
    "&=& B(x)\n",
    "\\end{array}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> The __*Taylor expansion*__ of $f(x)$ around the point $x=a$ is given by\n",
    "\n",
    "> $\\displaystyle f(x) = \\sum_{k=0}^\\infty \\frac{f^{(k)}(a)}{k!} (x - a)^k$,\n",
    "\n",
    "> where $f^{(k)}(a)$ is the $k$th derivative of $f$ evaluated at $x$.\n",
    "\n",
    "> __*c*__. Show that\n",
    "\n",
    "> $\\displaystyle b_n = \\frac{1}{n+1} \\binom{2n}{n}$\n",
    "\n",
    "> (the $n$th Catalan number) by using the Taylor expansion of $\\sqrt{1-4x}$ around $x=0$. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let $f(x) = \\sqrt{1-4x}$,\n",
    "\n",
    "The numerator of the derivative is \n",
    "\n",
    "$$\\begin{array}{rll}\n",
    "2 \\cdot (1 \\cdot 2) \\cdot (3 \\cdot 2) \\cdot (5 \\cdot 2)\\cdots &=&\n",
    "\\displaystyle 2^k \\cdot \\prod_{i=0}^{k - 2}(2k+1) \\\\\n",
    "&=& 2^k \\cdot \\frac{\\displaystyle (2(k-1))!}{\\displaystyle 2^{k-1}(k-1)!} \\\\\n",
    "&=& \\frac{\\displaystyle 2(2(k-1))!}{\\displaystyle (k-1)!} \\\\\n",
    "\\end{array}\n",
    "$$\n",
    "\n",
    "$$f(x) = 1 - 2x - 2x^2 - 4 x^3 - 10x^4 - 28x^5 - \\dots$$\n",
    "\n",
    "The coefficient is $\\frac{\\displaystyle 2(2(k-1))!}{\\displaystyle k!(k-1)!}$.\n",
    "\n",
    "$$\\begin{array}{rll}\n",
    "B(x) &=& \\displaystyle \\frac{1}{2x}(1-f(x)) \\\\\n",
    "&=& 1 + x + 2x^2 + 5x^3 + 14x^4 + \\dots \\\\\n",
    "&=& \\displaystyle \\sum_{n=0}^\\infty \\frac{\\displaystyle (2n)!}{\\displaystyle (n+1)!n!} x \\\\\n",
    "&=& \\displaystyle \\sum_{n=0}^\\infty \\frac{1}{n+1} \\frac{\\displaystyle (2n)!}{\\displaystyle n!n!} x \\\\\n",
    "&=& \\displaystyle \\sum_{n=0}^\\infty \\frac{1}{n+1} \\binom{2n}{n} x\n",
    "\\end{array}\n",
    "$$\n",
    "\n",
    "Therefore $\\displaystyle b_n = \\frac{1}{n+1} \\binom{2n}{n}$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*d*__. Show that\n",
    "\n",
    "> $\\displaystyle b_n = \\frac{4^n}{\\sqrt{\\pi}n^{3/2}} (1 + O(1/n))$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Based on Stirling's approximation $\\displaystyle n! \\approx \\sqrt{2 \\pi n} \\left ( \\frac{n}{e} \\right )^n$,\n",
    "\n",
    "$$\n",
    "\\begin{array}{rll}\n",
    "b_n &=& \\displaystyle \\frac{1}{n+1} \\frac{\\displaystyle (2n)!}{\\displaystyle n!n!} \\\\\n",
    "&\\approx& \\displaystyle \\frac{1}{n+1} \\frac{\\sqrt{4 \\pi n}(2n / e)^{2n}}{2 \\pi n (n/e)^{2n}} \\\\\n",
    "&=& \\displaystyle \\frac{1}{n+1} \\frac{4^n}{\\sqrt{\\pi n} } \\\\\n",
    "&=& \\displaystyle \\left ( \\frac{1}{n} + \\left ( \\frac{1}{n+1} - \\frac{1}{n} \\right ) \\right ) \\frac{4^n}{\\sqrt{\\pi n} } \\\\\n",
    "&=& \\displaystyle \\left ( \\frac{1}{n} - \\frac{1}{n^2+n} \\right ) \\frac{4^n}{\\sqrt{\\pi n} } \\\\\n",
    "&=& \\displaystyle \\frac{1}{n} \\left (1  - \\frac{1}{n+1} \\right ) \\frac{4^n}{\\sqrt{\\pi n} } \\\\\n",
    "&=& \\displaystyle \\frac{4^n}{\\sqrt{\\pi}n^{3/2}} (1 + O(1/n))\n",
    "\\end{array}\n",
    "$$"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
